Search | WHO COVID-19 Research Database

Extraction of knowledge graph of Covid-19 through mining of unstructured biomedical corpora.

Gajendran, Sudhakaran; Manjula, D; Sugumaran, Vijayan; Hema, R.

Comput Biol Chem ; 102: 107808, 2023 Feb.

Article in English | MEDLINE | ID: covidwho-2165189

ABSTRACT

The number of published biomedical articles has been increasing rapidly over the years. Currently there are about 30 million articles in PubMed and over 25 million mentions in Medline. Among the fundamental tasks for analysing this literature, Biomedical Named Entity Recognition (BioNER) and Biomedical Relation Extraction (BioRE) are the most essential. In the biomedical domain, a Knowledge Graph (KG) is used to visualize the relationships between various entities such as proteins, chemicals and diseases. Scientific publications have increased dramatically as a result of the search for treatments and potential cures for the new Coronavirus, but efficiently analysing, integrating, and utilising related sources of information remains difficult. To combat a disease effectively during pandemics like COVID-19, the literature must be used quickly and effectively. In this paper, we introduce a fully automated framework consisting of BERT-BiLSTM, Knowledge Graph, and Representation Learning models to extract the top diseases, chemicals, and proteins related to COVID-19 from the literature. The proposed framework uses Named Entity Recognition models for disease recognition, chemical recognition, and protein recognition. It then applies Chemical-Disease Relation Extraction and Chemical-Protein Relation Extraction models. Using these models, the system extracts entities and relations from the CORD-19 dataset and creates a Knowledge Graph from them. Finally, the system performs Representation Learning on this KG to obtain embeddings of all entities and to identify the top diseases, chemicals, and proteins related to COVID-19.
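The last two stages described in the abstract (building a graph from extracted relations, then ranking entities against COVID-19) can be illustrated with a minimal sketch. This is not the authors' method: the triples are hypothetical placeholders rather than CORD-19 extractions, and learned KG embeddings are replaced by a much simpler stand-in, where each entity is represented by its adjacency row (with a self-loop) and entities are ranked by cosine similarity. The function names `build_graph`, `cosine` and `top_related` are illustrative assumptions.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical toy triples (head, relation, tail); NOT taken from CORD-19.
TRIPLES = [
    ("chemical_A", "treats", "COVID-19"),
    ("chemical_A", "binds", "protein_X"),
    ("chemical_B", "treats", "disease_Y"),
    ("chemical_B", "binds", "protein_X"),
    ("chemical_C", "treats", "disease_Y"),
]


def build_graph(triples):
    """Return an undirected adjacency map: entity -> {neighbour: weight}."""
    graph = defaultdict(lambda: defaultdict(int))
    for head, _relation, tail in triples:
        graph[head][tail] += 1
        graph[tail][head] += 1
    # Self-loops make an entity similar to its direct neighbours as well.
    for entity in list(graph):
        graph[entity][entity] = 1
    return graph


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(key, 0) for key, weight in u.items())
    norm_u = sqrt(sum(weight * weight for weight in u.values()))
    norm_v = sqrt(sum(weight * weight for weight in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def top_related(graph, target, k=3):
    """Rank all other entities by similarity of their adjacency rows to target's."""
    scores = [
        (entity, cosine(graph[entity], graph[target]))
        for entity in graph
        if entity != target
    ]
    # Highest score first; ties broken alphabetically for determinism.
    scores.sort(key=lambda pair: (-pair[1], pair[0]))
    return scores[:k]


if __name__ == "__main__":
    kg = build_graph(TRIPLES)
    for entity, score in top_related(kg, "COVID-19", k=3):
        print(f"{entity}\t{score:.3f}")
```

In a real pipeline the adjacency rows would be replaced by embeddings produced by a representation learning model trained on the graph; the ranking step would stay essentially the same.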

Subject(s)

COVID-19 , Pattern Recognition, Automated , Humans , Data Mining/methods

Phenonizer: A fine-grained phenotypic named entity recognizer for Chinese clinical texts

Zou, Q.; Yang, K.; Chang, K.; Zhang, X.; Li, X.; Zhou, X.

2021 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2021 ; : 3963-3970, 2021.

Article in English | Scopus | ID: covidwho-1722891

ABSTRACT

Biomedical named entity recognition from clinical texts is a fundamental task for clinical data analysis, owing to the large volume of electronic medical record data, mostly in free-text format, available in real-world clinical settings. Clinical text data incorporates significant phenotypic medical entities, which could be used for profiling the clinical characteristics of patients in specific disease conditions. However, general approaches mostly rely on coarse-grained annotations (e.g. mentions of symptom terms) of phenotypic entities in benchmark text datasets. Owing to the numerous negation expressions of phenotypic entities (e.g. 'no fever', 'no cough' and 'no hypertension') in clinical texts, such annotations cannot feed the subsequent data analysis process with well-prepared structured clinical data. Thus, we constructed a fine-grained Chinese clinical corpus. Thereafter, we proposed a phenotypic named entity recognizer (Phenonizer). The results on the test set show that Phenonizer outperforms methods based on Word2Vec, with an F1-score of 0.896. By comparing character embeddings from different data, it is found that character embeddings trained on clinical corpora can improve the F-score by 0.0103. Furthermore, the fine-grained dataset enables methods to distinguish between negated symptoms and presented symptoms, and avoids the interference of negated symptoms. Finally, we tested the generalization performance of Phenonizer, achieving a superior F1-score of 0.8389. In summary, together with the fine-grained annotated benchmark dataset, Phenonizer offers a feasible approach to effectively extract symptom information from Chinese clinical texts with acceptable performance. © 2021 IEEE.
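To make the role of fine-grained labels concrete, the sketch below computes an entity-level F1-score where a predicted entity counts as correct only if its span and its label both match the gold annotation. It is an illustration under stated assumptions, not the evaluation code used for Phenonizer: the label names `SYMPTOM` and `NEG_SYMPTOM`, the English example sentence, and the function `entity_f1` are all hypothetical.

```python
def entity_f1(gold, predicted):
    """Return (precision, recall, F1) over sets of (start, end, label) entities.

    An entity is a true positive only when span and label match exactly.
    """
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical example sentence: "no fever, cough"
# Character offsets are [start, end): "fever" = [3, 8), "cough" = [10, 15).
gold = {(3, 8, "NEG_SYMPTOM"), (10, 15, "SYMPTOM")}

# A coarse-grained tagger that ignores negation labels both mentions SYMPTOM.
coarse_prediction = {(3, 8, "SYMPTOM"), (10, 15, "SYMPTOM")}

# A fine-grained tagger that recognises the negated mention.
fine_prediction = {(3, 8, "NEG_SYMPTOM"), (10, 15, "SYMPTOM")}

if __name__ == "__main__":
    print("coarse:", entity_f1(gold, coarse_prediction))
    print("fine:  ", entity_f1(gold, fine_prediction))
```

Under this strict matching, a tagger that finds the right span for 'fever' but misses the negation is penalised in both precision and recall, which is the distinction a fine-grained corpus makes measurable.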